1. Field of the Invention
The present invention relates to a translation apparatus and a storage medium storing therein a translation apparatus controlling program which are applied to word processors, personal computers, portable information processors and the like for translating an inputted source document and outputting a translation.
2. Description of the Related Arts
In recent years, a number of computer-based translation machines have been developed. However, the performance of the computer-based translation machines is not comparable to that of professional translators. One reason for this is that a source document typically contains layout information such as line feeding, cascading and itemization in addition to text. Professional translators can readily understand the meaning of such layout information, while the translation machines cannot.
If the layout information in the source document cannot be detected, it cannot be extracted from the source document. As a result, the translation range in the source document cannot be specified correctly, and mistranslation results. For example, if an itemization tag is mistakenly regarded as part of a sentence, the sentence analysis may be erroneous.
As one conventional translation method utilizing layout information, Japanese Unexamined Patent Publication No. HEI 2(1990)-208775, for example, proposes a machine translation method in which non-sentence information such as itemized text portions, mathematical expressions and titles is detected through comparison with pattern matching data in a non-sentence information processing section, and a translation is produced in consideration of the detected non-sentence information to improve translation accuracy.
Further, Japanese Unexamined Patent Publication No. HEI 5(1993)-303589 proposes a translation machine which is capable of detecting headline portions, paragraph text portions and itemized text portions as syntax patterns by layout analyzing means and utilizes different sentence generation rules depending on the syntax patterns for translation thereof.
However, the arts utilizing the layout information as disclosed in Japanese Unexamined Patent Publications No. HEI 2(1990)-208775 and HEI 5(1993)-303589 are based on the assumption that source documents are composed in a fixed format and, therefore, cannot flexibly cope with various itemization formats. Electronic mail, which has recently become widespread, is composed in various formats. For example, in an electronic mail dialog, quotation tags are added to sentences as follows:
______________________________________
>> >> >> This is a test mail. Is there anything wrong?
>> >> I received your mail. Everything seems fine.
>> Thank you. I'm relieved. See you on Sunday!
No problem. See You!
:
or
John : This is a test mail. Is There anything wrong?
Mary >> I received your mail. Everything seems fine.
John : Thank you. I'm relieved. See you on Sunday!
Mary >> No problem. See You!
:
______________________________________
In other fields, various layout formats are employed for production of documents depending on the producers' preferences. Although professional translators can readily understand such varied layout information, the conventional translation machines are not designed to extract it properly and therefore fail to specify the translation range correctly.
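To make the problem of specifying a translation range concrete, the following is a minimal sketch, not the method of either cited publication, of how a quotation tag might be separated from the text to be translated. The pattern `PREFIX` and the function `split_layout_prefix` are hypothetical names introduced here for illustration. The pattern covers only the two mail formats quoted above, which is exactly the limitation of a fixed-format approach.

```python
import re
from typing import Tuple

# Hypothetical pattern (an assumption, not taken from the cited publications):
# either a run of ">>" quotation tags, or a capitalized speaker name
# followed by ":" or ">>".
PREFIX = re.compile(r"^\s*(?:(?:>>\s*)+|[A-Z][a-z]+\s*(?::|>>)\s*)")

def split_layout_prefix(line: str) -> Tuple[str, str]:
    """Split a line into (layout prefix, text to translate)."""
    match = PREFIX.match(line)
    if match is None:
        # No recognized tag: the whole line is the translation range.
        return "", line.strip()
    return match.group(0).strip(), line[match.end():].strip()

if __name__ == "__main__":
    samples = [
        ">> >> I received your mail. Everything seems fine.",
        "John : Thank you. I'm relieved. See you on Sunday!",
        "Mary >> No problem. See You!",
        "No problem. See You!",
    ]
    for sample in samples:
        print(split_layout_prefix(sample))
```

A line in any other format (for example, one beginning with "Note:") would be misclassified or passed through with its tag intact. This illustrates why a fixed set of patterns cannot cope with the variety of layouts found in practice.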